Skip to content

refactor(pgvector): use async DocumentStore mixin tests#3094

Open
SyedShahmeerAli12 wants to merge 10 commits intodeepset-ai:mainfrom
SyedShahmeerAli12:refactor/pgvector-async-mixin-tests
Open

refactor(pgvector): use async DocumentStore mixin tests#3094
SyedShahmeerAli12 wants to merge 10 commits intodeepset-ai:mainfrom
SyedShahmeerAli12:refactor/pgvector-async-mixin-tests

Conversation

@SyedShahmeerAli12
Copy link
Copy Markdown
Contributor

Related Issues

Closes #3050

What this changes

Refactors integrations/pgvector/tests/test_document_store_async.py to inherit from the async mixin classes introduced in deepset-ai/haystack#10799.

  • Removes 17 duplicate tests now covered by the mixins
  • Keeps all pgvector-specific tests (blob write, connection recreation, invalid connection string, empty meta validation, hnsw index recreation, table management)

Closes deepset-ai#3052

Replaces duplicate async test implementations with the mixin classes
from haystack.testing.document_store and haystack.testing.document_store_async.

- Removes 19 tests now covered by the mixins
- Keeps all Qdrant-specific tests (hybrid search, sparse config, collection
  setup validation, vector preservation)
Closes deepset-ai#3050

Replaces duplicate async test implementations with mixin classes
from haystack.testing.document_store and haystack.testing.document_store_async.

- Removes 17 tests now covered by the mixins
- Keeps all pgvector-specific tests (blob write, connection recreation,
  invalid connection, empty meta validation, hnsw index, table management)
@SyedShahmeerAli12 SyedShahmeerAli12 requested a review from a team as a code owner April 2, 2026 15:22
@SyedShahmeerAli12 SyedShahmeerAli12 requested review from sjrl and removed request for a team April 2, 2026 15:22
@github-actions github-actions bot added integration:qdrant integration:pgvector type:documentation Improvements or additions to documentation labels Apr 2, 2026
@sjrl
Copy link
Copy Markdown
Contributor

sjrl commented Apr 10, 2026

@SyedShahmeerAli12 thanks for the contribution. It seems though this PR is touching qdrant files even though it should only be for pgvector right?

This PR only covers pgvector. The qdrant async test refactor
was accidentally included - reverting to keep scope correct.
@github-actions
Copy link
Copy Markdown
Contributor

Coverage report (qdrant)

This PR does not seem to contain any modification to coverable code.

get_metadata_field_min_max and its async counterpart now return
{"min": None, "max": None} when the field is not present in the
store (e.g. empty collection) instead of raising ValueError.

Fixes test_get_metadata_field_min_max_empty_collection_async.
For numeric fields (integer, real), returns numeric min/max.
For text fields, returns lexicographic min/max based on database collation.
:raises ValueError: If the field doesn't exist or has no values.
Returns ``{"min": None, "max": None}`` when the field has no values or the store is empty.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We shouldn't use double back ticks in our api docstrings only single backticks

Suggested change
Returns ``{"min": None, "max": None}`` when the field has no values or the store is empty.
Returns `{"min": None, "max": None}` when the field has no values or the store is empty.

…ty store

The hardcoded 'content' entry was always included in fields_info even
when the store had no documents. The mixin contract (and common sense)
requires {} for an empty store. Removed the hardcoded initialisation
and updated the local test to match.
pgvector stores embeddings as float32, so round-tripped embeddings lose
precision and exact Document equality fails. The same override already
exists in test_filters.py for the sync FilterDocumentsTest — applying
the same pattern to TestDocumentStoreAsync.
Two fixes:
1. Move 'import pytest' from inside assert_documents_are_equal to
   top of file (PLC0415 lint rule).
2. Override test_count_not_empty_async: the haystack mixin defines
   this method without '@staticmethod' or 'self', so pytest passes
   the class instance as 'document_store'. Adding a proper 'self'
   override makes fixture injection work correctly.
…viour

The mixin raises NotImplementedError by design — each store must
declare its own default duplicate policy. pgvector raises
DuplicateDocumentError on a second write without an explicit policy,
matching the sync test_write_documents override.
CountDocumentsByFilterAsyncTest, CountUniqueMetadataByFilterAsyncTest,
and UpdateByFilterAsyncTest were added to haystack.testing.document_store
in 2.27.0. The lowest-direct-deps CI run was resolving to 2.26.1 and
failing with ImportError.
@github-actions
Copy link
Copy Markdown
Contributor

Coverage report (pgvector)

Click to see where and how coverage changed

FileStatementsMissingCoverageCoverage
(new stmts)
Lines missing
  integrations/pgvector/src/haystack_integrations/document_stores/pgvector
  document_store.py 1672-1675, 1851-1859, 1883-1891
Project Total  

This report was generated by python-coverage-comment-action

@SyedShahmeerAli12
Copy link
Copy Markdown
Contributor Author

All tests passing now. Sorry for the initial noise the qdrant file was accidentally included and has been reverted.

A few fixes were needed along the way: get_metadata_field_min_max and get_metadata_fields_info now return empty/None results for an empty store instead of raising; added assert_documents_are_equal override to handle float32 precision loss; overrode two mixin methods that had fixture injection issues; and bumped haystack-ai>=2.27.0 since the async test mixin classes were introduced there.

@sjrl @davidsbatista

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Use async DocumentStore mixin tests in pgvector

2 participants